Barley (Hordeum vulgare) is one of the world's most important cereal crops. Although its large and complex 
genome has held back barley genomics for quite a while, the whole genome sequence was released in 2012 
by the International Barley Genome Sequencing Consortium (IBSC). Moreover, more than 30,000 barley 
full-length cDNAs (FLcDNAs) are now available in the public domain. Here we present the Barley Gene 
Expression Database (bex-db: http://barleyflc.dna.affrc.go.jp/bexdb/index.html) as a repository of transcrip- 
tome data including the sequences and the expression profiles of barley genes resulting from microarray anal- 
ysis. In addition to FLcDNA sequences, bex-db also contains partial sequences of more than 309,000 novel 
expressed sequence tags (ESTs). Users can browse the data via keyword, sequence homology and expression 
profile search options. A genome browser was also developed to display the chromosomal locations of barley 
FLcDNAs and wheat (Triticum aestivum) transcripts as well as Aegilops tauschii gene models on the IBSC 
genome sequence for future comparative analysis of orthologs among Triticeae species. The bex-db should 
provide a useful resource for further genomics studies and development of genome-based tools to enhance 
the progress of the genetic improvement of cereal crops. 
Introduction 

Among the major cereal crops, barley (Hordeum vulgare) is 
ranked fourth in worldwide production behind wheat, rice 
and maize (http://faostat.fao.org). However, the large ge- 
nome size of about 5.1 gigabases (Gb) and the high rate of 
repetitive elements (>80%) of this important crop have 
largely hindered the development of genomic studies for 
many years. Ahead of barley genome sequencing, large- 
scale analysis of full-length cDNAs (FLcDNAs) derived 
from the Japanese malting barley variety 'Haruna Nijo' was 
conducted in Japan (Matsumoto et al. 2011, Sato et al. 
2009). An accompanying database was also developed to 
provide access to the clones and sequence information. The 
FLcDNA information accelerates molecular and evolution- 
ary studies, particularly of barley genes such as the 
CONSTANS-like (COL) gene family, which is known to 
control flowering (Cockram et al. 2012, Kikuchi et al. 
2012). In 2012, IBSC released the whole genome sequence 
of barley obtained from a malting barley variety 'Morex' 
(The International Barley Genome Sequencing Consortium 
2012). This provides an extensive opportunity for more 
comprehensive characterization of the genomic sequences 
on each chromosome, and insights into the overall structure 
and function of the entire genome. Barley has become a 
model organism for understanding the structure and function 
of Triticeae genomes and developing genomics tools for 
future improvement of crops. 

We have upgraded the barley FLcDNA database to provide a more
robust repository of information on barley gene structures and
gene functions. The bex-db currently contains the sequences and
annotations of FLcDNAs, gene expression profiles based on
microarray analysis of various experimental materials and
conditions, and the sequences of novel barley ESTs. Here, we
report the main contents and major features of the upgraded
database and demonstrate the options for data access.
Database contents 

The bex-db currently contains data on nucleotide sequences, 
microarray gene expression profiles and structural features 
of expressed genes in barley (Fig. 1). Every nucleotide sequence
record carries a clone ID and is either a full-length sequence
(FLcDNA) or a partial sequence of a clone corresponding to an EST.

Nucleotide sequences 

Information on the construction and sequencing of 12
'Haruna Nijo' cDNA libraries is available from the top page
(Project Summary) of bex-db
(http://barleyflc.dna.affrc.go.jp/bexdb/pages/help/project_summary.html).
The FLcDNA data represent the completed nucleotide sequences of
4,999 clones (FLbaf) analyzed by Okayama University
(http://www.shigen.nig.ac.jp/barley/) and 24,783 clones (NIASHv)
analyzed by the National Institute of Agrobiological Sciences
(Fig. 1) (Sato et al. 2009, Matsumoto et al. 2011).




Fig. 1. Dataset of barley FLcDNAs and ESTs in bex-db. The bex-db
currently contains FLcDNA and EST sequences obtained from 12 cDNA
libraries of the barley variety 'Haruna Nijo' by Okayama University
(FLbaf FLcDNAs) and the National Institute of Agrobiological
Sciences (NIAS FLcDNAs), together with gene expression profiles
based on microarray analysis of different experimental materials
and conditions.



Clustering of the above FLcDNAs was performed as reported
previously, which led to the identification of 4,543 clusters
(FL_CL) and 18,117 singletons (FL_OP) in the current
database (Matsumoto et al. 2011). To attach annotations to
the above FLcDNA sequences, we predicted the open read- 
ing frames and gene functions by BLASTX search (Altschul 
et al. 1990) using the RefSeq database (NCBI Resource 
Coordinators 2013) and the UniProtKB database (Magrane 
and UniProt Consortium 2011). The InterProScan software 
(Mulder and Apweiler 2007) was also used to assign both 
InterPro domains (http://www.ebi.ac.uk/interpro/) and Gene 
Ontology annotations (http://www.geneontology.org/) to the 
FLcDNA sequences (Fig. 2A). 

Additionally, we released the sequences of 309,117 new
ESTs (DK584720-DK887267) derived from 167,596 cDNA clones to
the bex-db in this study (Fig. 1). These ESTs comprise 141,521
pairs of 5'-end and 3'-end sequences and 26,075 single sequences
from either the 5'- or 3'-end of the clones. Based on sequence
similarity, analyzed using the EST clustering programs TGICL
(http://compbio.dfci.harvard.edu/tgi/software/) and CAP3
(Huang and Madan 1999) or, when necessary, our in-house
re-clustering program, we constructed 27,562 contigs (Hv-Contig)
that can be visualized through the "Contig viewer" of bex-db.
Alignments and consensus sequences of these EST contigs are
downloadable through the database. We also constructed 22,148
contig clusters (EST_CL), based on paired-end information
(http://barleyflc.dna.affrc.go.jp/bexdb/pages/help/clustering_method.html),
which are displayed in a "Cluster viewer". The remaining 3,380
EST sequences (EST_OP) are present as singletons in the database.
Library names and clone information for each EST are also
included in these viewers. To provide users with more information
on the predicted functions of barley genes, we also ran a BLASTX
search (Altschul et al. 1990) of all consensus sequences of EST
contigs (Hv-Contig) against the RefSeq database (NCBI Resource
Coordinators 2013) and the UniProtKB database (Magrane and
UniProt Consortium 2011). This information can be accessed by a
keyword search as described later.
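
The grouping of contigs into contig clusters by paired ends can be illustrated with a minimal sketch. It assumes, which the text does not state, that two contigs belong to the same cluster whenever the 5'-end and 3'-end reads of one clone were assembled into different contigs. The function name, the input layout and all clone and contig IDs are hypothetical; the procedure actually used is described on the help page cited above.

```python
from collections import defaultdict


def cluster_contigs(paired_ends):
    """Group contigs that are linked by the two end reads of one clone.

    paired_ends maps a clone ID to a tuple
    (contig holding its 5'-end read, contig holding its 3'-end read).
    This linking rule is an illustrative assumption.
    """
    parent = {}

    def find(contig):
        # Union-find lookup with path halving.
        parent.setdefault(contig, contig)
        while parent[contig] != contig:
            parent[contig] = parent[parent[contig]]
            contig = parent[contig]
        return contig

    for five_prime, three_prime in paired_ends.values():
        root_a, root_b = find(five_prime), find(three_prime)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    clusters = defaultdict(set)
    for contig in list(parent):
        clusters[find(contig)].add(contig)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical clone and contig IDs
example_pairs = {
    "cloneA": ("Hv-Contig1", "Hv-Contig2"),
    "cloneB": ("Hv-Contig2", "Hv-Contig3"),
    "cloneC": ("Hv-Contig4", "Hv-Contig4"),
}
contig_clusters = cluster_contigs(example_pairs)
```

In this toy input, cloneA and cloneB chain three contigs into one cluster, while Hv-Contig4 stays on its own because both ends of cloneC fall in the same contig.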

Microarray data 

A 60-mer oligonucleotide microarray was developed to
characterize gene expression levels in barley using different
experimental samples and conditions (Nakamura et al. in
preparation). The customized 4x44K microarray platform
(Agilent Technologies) contains the probes designed
from 36,632 barley FLcDNAs. Our bex-db provides the 
current expression profiles of barley genes obtained from the
above microarray analysis of the following experimental samples
and conditions: root and shoot treated with or without abscisic
acid (ABA), jasmonic acid (JA), cold, drought, aluminum and salt
stress for 3 h, 6 h and 24 h (Fig. 2B). To compare the expression
profiles among different genes, we calculated Pearson's
correlation coefficient for clone pairs on the array using their
relative expression levels. A list of the top 100 genes (probes)
with positively or negatively correlated expression patterns
was created for each microarray dataset.

Fig. 2. Database interfaces of bex-db. (A) The FLcDNA page provides
detailed information on sequences and annotations such as ORFs,
domains and gene ontology for each clone. (B) The expression profile
page presents an overview of the expression levels of each barley
gene resulting from microarray analysis of roots and shoots under
the various experimental conditions. A list of genes with positively
or negatively correlated expression, based on Pearson's correlation
coefficient, is also provided. (C) The genome browser page provides
the results of chromosomal mapping of the barley FLcDNAs, revealing
the physical locations, structures and predicted functions of
genes on the 'Morex' genome sequence.
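
The correlation ranking of expression profiles can be sketched as follows. This is a minimal illustration in plain Python rather than the code behind bex-db: the probe names and expression values are invented, and the function names and the `top_n` parameter are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles.

    Profiles with zero variance are not handled (division by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlated_probes(profiles, query, top_n=100):
    """Return the most positively and most negatively correlated probes."""
    scores = [
        (probe, pearson(profiles[query], values))
        for probe, values in profiles.items()
        if probe != query
    ]
    positive = sorted((s for s in scores if s[1] > 0), key=lambda s: -s[1])
    negative = sorted((s for s in scores if s[1] < 0), key=lambda s: s[1])
    return positive[:top_n], negative[:top_n]


# Hypothetical relative expression levels across four conditions
profiles = {
    "probe1": [1.0, 2.0, 3.0, 4.0],
    "probe2": [2.0, 4.0, 6.0, 8.0],
    "probe3": [4.0, 3.0, 2.0, 1.0],
    "probe4": [1.0, 3.0, 2.0, 4.0],
}
positive, negative = correlated_probes(profiles, "probe1", top_n=2)
```

For probe1, the positive list starts with probe2 (same trend, r close to 1) followed by probe4 (r = 0.8), and the negative list contains probe3 (opposite trend, r close to -1).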



Mapping of FLcDNAs onto the barley genome 

A genome browser displaying the annotation data of barley
genes reported by IBSC has already been developed within the
EnsemblPlants database
(http://plants.ensembl.org/Hordeum_vulgare/Info/Index/). This
database, however, does not contain any information on the
sequence analysis of barley FLcDNAs. To provide additional
information on the barley genome, such as the physical positions
and structures of expressed genes or the composition and
chromosomal distribution of repetitive sequences, we
reconstructed the genome viewer with GBrowse 2.54 software
(Stein et al. 2002) using the genomic sequence of the barley
variety 'Morex' published by IBSC (2012). The IBSC sequence of
the barley genome currently consists of 2,670,738 assembled
contigs totaling 1.9 Gb. These sequence contigs were assigned to
the short (HS) and long (HL) arms of all seven barley chromosomes
except for chromosome 1H, according to the order indicated in
EnsemblPlants. Some contigs remained unassigned because their
chromosomal positions could not be determined (vHS, vHL and
unanchored contigs). After masking repeats with the software
tool CENSOR (http://www.girinst.org/downloads/software/censor/)
and mipsREdat_9.0p_Poaceae_TEs as described previously
(Nussbaumer et al. 2013), we mapped 11,758 barley FLcDNA
sequences onto the barley chromosomes through homology searches
using BLASTN (>95% identity and >90% coverage) (Altschul et al.
1990) and EST2genome (Mott 1997) in the present study (Fig. 2C).
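
The identity and coverage thresholds used for mapping can be expressed as a simple filter. The sketch below is an illustration under stated assumptions: coverage is taken to be the aligned fraction of the query (FLcDNA) sequence, which the text does not define, each query is represented by a single alignment, and the function name, the tuple layout and the sequence IDs are hypothetical.

```python
def passes_mapping_filter(identity_pct, align_start, align_end, query_len,
                          min_identity=95.0, min_coverage=90.0):
    """Apply identity and coverage thresholds to one alignment.

    Coverage is assumed to be the aligned fraction of the query
    sequence, expressed as a percentage.
    """
    aligned = abs(align_end - align_start) + 1
    coverage_pct = 100.0 * aligned / query_len
    return identity_pct > min_identity and coverage_pct > min_coverage


# Hypothetical hits: (query ID, % identity, query start, query end, query length)
hits = [
    ("cdna_1", 99.2, 1, 1480, 1500),  # high identity, nearly full length
    ("cdna_2", 96.0, 1, 800, 1500),   # coverage too low
    ("cdna_3", 93.5, 1, 1500, 1500),  # identity too low
]
mapped = [h[0] for h in hits if passes_mapping_filter(*h[1:])]
```

Calling the same function with `min_identity=70.0` and `min_coverage=50.0` applies the looser thresholds used for sequences from other species.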

Besides the above barley FLcDNAs, we were also able to
tentatively map 50,554 of the 83,382 wheat mRNAs deposited in
DDBJ/EMBL/GenBank and 23,017 of the 43,150 gene models predicted
in Aegilops tauschii (Jia et al. 2013) onto the above barley
genomic sequence (identity >70% and coverage >50%). These results
should provide additional resources for future comparative and
functional genomics studies within the Triticeae.

Search functions 

The database provides tools for accessing all data contents
through a keyword search, a sequence homology search and an
expression profile search from the top page, as described below
(Fig. 1). Moreover, all data are integrated by clone ID, enabling
users to view the results of our different experiments and
analyses together with associated information through a
cross-link function.

Keyword search 

A keyword search can be used to obtain basic information
about ESTs and FLcDNAs, including their sequences and functional
annotations based on the UniProtKB and RefSeq databases. Keywords
such as accession numbers, clone IDs, gene names, or any word
associated with the gene function can be used for the search.
The top 50 BLAST hits are shown in a list. Users can choose to
display the results obtained from all species, from a limited
number of species or from a single species, depending on their
interests. The results can also be sorted by BLAST hit score or
E value.

Homology search 

A BLAST search engine (Altschul et al. 1990) is provided
for comparative analysis of gene sequences. A search can be
performed against the barley FLcDNA or EST sequences from the
'Haruna Nijo' variety and the whole genome sequence of the
'Morex' variety. Depending on the query, users can choose BLASTN
or TBLASTX for nucleotide sequences and TBLASTN for amino acid
sequences. These searches generate a list of homologous sequences
matching the query, including the alignments. For each hit
sequence, a link to the basic clone information or the genome
browser is also provided.
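
The choice of BLAST program for a nucleotide database follows from the query type, as the short sketch below shows. The function name and its arguments are hypothetical and are not part of the bex-db interface.

```python
def choose_blast_program(query_type, translated=False):
    """Pick a BLAST program for searching a nucleotide database.

    query_type is "nucleotide" or "protein". For a nucleotide query,
    translated=True requests a comparison at the protein level.
    """
    if query_type == "protein":
        # Protein query against a translated nucleotide database.
        return "TBLASTN"
    if query_type == "nucleotide":
        # TBLASTX translates both query and database; BLASTN compares
        # nucleotide sequences directly.
        return "TBLASTX" if translated else "BLASTN"
    raise ValueError(f"unknown query type: {query_type!r}")


program_for_cdna = choose_blast_program("nucleotide")
program_for_protein = choose_blast_program("protein")
```

TBLASTX is the slower option for nucleotide queries but can detect more diverged homologs, because the comparison is made on translated sequences.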

Expression profile search 

A search for the expression profiles of barley genes can
be initiated by specifying the samples (root/shoot), the
experimental conditions and the time points of treatment (3 h,
6 h, 24 h) used for microarray analysis, and by setting the fold
change and cutoff of expression levels. The search results are
shown in tabular format, indicating the IDs of hit clones and the
exact fold changes under a given condition together with other
information such as the accession numbers and cluster IDs of
FLcDNAs, the IDs of EST contigs and the functional annotation
based on BLASTX analysis against the UniProtKB and RefSeq
databases. Each clone ID links to its expression profile, which
shows the full dataset obtained from microarray analysis in both
tabular and graph formats together with a list of its
co-expressed genes. The search can be further expanded using the
"add" button to create expression profiles across multiple
treatments and experimental conditions.
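
The logic of such a search can be sketched as a filter over expression records. This is an illustrative sketch only: the record layout, the field names and the clone IDs are hypothetical, and fold change is assumed to mean the ratio of treated to untreated signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    clone_id: str       # hypothetical clone identifier
    tissue: str         # "root" or "shoot"
    treatment: str      # e.g. "cold", "ABA", "drought"
    hours: int          # treatment time point: 3, 6 or 24
    fold_change: float  # assumed: treated signal / untreated signal


def search_profiles(records, tissue, treatment, hours, min_fold):
    """Return (clone ID, fold change) pairs above min_fold, highest first."""
    hits = [
        (r.clone_id, r.fold_change)
        for r in records
        if r.tissue == tissue
        and r.treatment == treatment
        and r.hours == hours
        and r.fold_change > min_fold
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical records, not values taken from bex-db
records = [
    Measurement("clone_1", "shoot", "cold", 24, 62.0),
    Measurement("clone_2", "shoot", "cold", 24, 3.5),
    Measurement("clone_3", "root", "cold", 24, 80.0),
    Measurement("clone_4", "shoot", "cold", 6, 55.0),
    Measurement("clone_5", "shoot", "cold", 24, 75.0),
]
cold_shoot_24h = search_profiles(records, "shoot", "cold", 24, min_fold=50.0)
```

Only clone_5 and clone_1 satisfy all four criteria here; the other records fail on tissue, time point or fold change.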

Genome browser 

To view genomic information about barley, users can
choose either a homology search of sequences or direct access to
the genome browser (GBrowse) from the top page of bex-db. For
homology searches, the IBSC genome sequence serves as the
database, and the hit contigs in the BLAST result link directly
to the genome browser. A keyword search against GBrowse can be
performed using the accession numbers of the FLcDNAs of interest
to view their physical locations on the barley genome. The IBSC
gene IDs and the transcript or gene IDs of the other species can
also be used as keywords for the same purpose. All FLcDNA
sequences mapped onto the barley genome in GBrowse link to their
clone information. It should be noted that not all of the barley
cDNA clones have been completely sequenced or used for
microarray analysis to date. All of the cDNA clones, however,
have EST sequences with contig and/or cluster IDs, which should
enable users to find other clones assembled within the same
contigs or clusters and thereby reach data on FLcDNA sequences
and gene expression profiles. To enrich this information, further
chromosomal mapping of all EST sequences must be conducted.

An example of verifying candidate genes or gene functions using bex-db

The FLOWERING LOCUS T (FT)-like genes are involved
mainly in controlling flowering signals and are regulated by
photoperiod and vernalization pathways in barley (Faure et al.
2007). A keyword search against bex-db using 'FT-like protein'
or a homology search using the HvFT2 protein sequence from
UniProtKB (A0S6X4) returned a single FLcDNA clone, Hv3018N13
(accession no. AK373041) (Fig. 2A). The expression profile
indicates that this gene is strongly expressed under cold stress,
with a more than 50-fold change in expression level in the shoots
after 24 h of treatment (Fig. 2B). This result can also be
obtained by searching the gene expression profiles under specific
experimental conditions, such as cold stress, shoot, 24 h and
>50-fold change. Chromosomal mapping of this FLcDNA sequence by
BLASTN analysis against the IBSC genomic sequence showed that the
gene is located within Morex_contig_1558556 on barley chromosome
3HS, consistent with a previous report (Faure et al. 2007).
Furthermore, by comparing the expression profiles among different
genes, a barley FLcDNA clone, NIASHv1098F14 (accession no.
AK359508), predicted to encode cold-regulated protein 1, was
found among the top 10 most positively correlated genes according
to Pearson's correlation coefficient. These results demonstrate
that bex-db can be a useful tool not only for comparative
genomics among cereal crops, but also for the future
identification of gene functions in barley.

Clone order 

The barley FLcDNA clones are available for distribution from
the National Institute of Agrobiological Sciences on request to
the NIAS DNA Bank (http://www.dna.affrc.go.jp/distribution/).